JobOlize - Headhunting by Information Extraction in the Era of Web 2.0
Abstract
E-recruitment is one of the most successful e-business applications, supporting both headhunters and job seekers. The explosive growth of online job offers makes it necessary to use information extraction techniques to build up, e.g., job portals in a semi-automatic way. Existing approaches, however, can hardly cope with the heterogeneous and semi-structured nature of job offers, nor do they consider the potential offered by Web 2.0 technologies. This paper proposes an information extraction system called “JobOlize”, realized for arbitrarily structured IT job offers. To improve extraction quality, a hybrid approach is employed, combining existing NLP techniques with a new form of context-driven extraction that incorporates layout, structure and content information. To let users adapt the extraction results while preserving the look and feel of the original Web pages, a rich client interface is provided. The improvements in extraction quality are justified on the basis of a case study, and the experiences gained are generalized and critically reflected upon in a discussion of lessons learned.
Similar resources
Automatisiertes Headhunting im Web 2.0: Zum Einsatz intelligenter Softwareagenten als Headhunting-Robots
From a design perspective, the paper outlines an automated, agent-based headhunting system for online social networks (OSN). In addition to the basic project idea, it examines macroeconomic and microeconomic modeling aspects of the headhunting system. The results show that deliberative intelligent software agents are, in principle, suited to headhunting activities in ...
Familiarity with and Use of Web 2.0 Tools in Library Services by Librarians Working at Iran, Tehran, and Shahid Beheshti Universities of Medical Sciences
Background and Aim: Web 2.0 technology has various uses in libraries all over the world. According to studies, however, this technology seems to be rarely used in Iranian academic libraries. Therefore, the present study aims to determine the level of familiarity with and use of Web 2.0 tools among librarians working at Iran, Tehran, and Shahid Beheshti Universities of Medical Sciences. ...
Presenting a method for extracting structured domain-dependent information from Farsi Web pages
Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...
Reviewing Shahid Beheshti University Scholars’ Presence in ResearchGate®
Background and Aim: Web 2.0 capabilities in research fields have provided numerous conveniences for scholars. This has enabled people to interact and share their publications with other scholars across the world. The purpose of this research is to study the presence of Shahid Beheshti University scholars in ResearchGate. Method: The approach used in this paper is scientometrics with altmetrics met...
Data Extraction using Content-Based Handles
In this paper, we present an approach and a visual tool, called HWrap (Handle Based Wrapper), for creating web wrappers to extract data records from web pages. Our approach relies mainly on the visible page content to identify data regions on a web page. Our extraction algorithm is inspired by the way a human user scans the page content for specific data. In particular, we use text fea...